LiDAR-Based Sensor Fusion SLAM and Localization for Autonomous Driving Vehicles in Complex Scenarios

LiDAR-based simultaneous localization and mapping (SLAM) and online localization methods are widely used in autonomous driving and are key parts of intelligent vehicles. However, current SLAM algorithms suffer from map drift, and localization algorithms based on a single sensor adapt poorly to complex scenarios. This paper proposes a SLAM and online localization method based on multi-sensor fusion and integrates it into a general framework. In the mapping process, constraints consisting of normal distributions transform (NDT) registration, loop closure detection, and real-time kinematic (RTK) global navigation satellite system (GNSS) positions form the front-end, and a pose graph optimization algorithm forms the back-end; together they yield an optimized map without drift. In the localization process, the error state Kalman filter (ESKF) fuses the LiDAR-based localization position and the vehicle states to realize more robust and precise localization. The open-source KITTI dataset and field tests are used to evaluate the proposed method. The test results show that the method achieves 5-10 cm mapping accuracy and 20-30 cm localization accuracy, and it realizes online autonomous driving in complex scenarios.


Introduction
With the development of intelligent and connected vehicle technology, intelligent transportation systems with autonomous passenger cars, commercial vehicles, and taxis have undergone tremendous changes in the perception of complex scenarios. Vehicle localization is a key problem in autonomous driving, and realizing high-precision vehicle localization when satellite signals are unavailable or roads are unstructured is one of the technical problems that urgently needs to be solved. Localization techniques based on vision [1] and satellite observations can achieve centimeter-level localization but rely heavily on satellite signals, traffic signs, and initialization [2]. LiDAR-based localization techniques are largely invariant to changes in illumination and satellite signals. However, they require high-precision maps with dense point clouds, and map-based multi-sensor fusion localization should be widely used to cover different driving conditions [3].
LiDAR SLAM is widely used in the construction of 3D point cloud maps [4]. The architecture of a simultaneous localization and mapping (SLAM) system consists of a front-end and a back-end. The front-end interprets the sensor data to obtain constraints as the basis for the optimization, such as point cloud registration, loop closure detection, or Global Navigation Satellite System (GNSS) poses. The back-end computes the best map estimate from the given constraints using optimization techniques [5]. Many registration methods have been proposed for the front-end, such as the iterative closest point (ICP) [6], the normal distributions transform (NDT) [7], and feature-based methods [8]. However, typical registration methods suffer from drift in large-scale tests due to poor performance in loop closure detection and position correction.

The main contributions of this paper are summarized as follows.

• The NDT registration, scan context-based loop closure detection, and RTK-GNSS are integrated into a LiDAR SLAM framework, with an innovative use of the pose graph to combine multiple methods to optimize the position and reduce map drift.
• The LiDAR matching localization position and the vehicle states are fused by the ESKF, which takes full advantage of the vehicle velocity constraints of ground autonomous vehicles to optimize the localization results and provide robust and accurate localization.
• A general framework with mapping and localization is proposed and tested on the KITTI dataset [14] and in real scenarios. The results demonstrate the effectiveness of the proposed framework.
The rest of the paper is structured as follows. Related work on mapping and localization is presented in Section 2. The offline mapping process is introduced in Section 3, and the online localization method is introduced in Section 4. The experimental evaluation is given in Section 5, and the discussion in Section 6. Finally, the conclusions and future work are presented in Section 7.

Related Work
In this section, a brief overview of algorithms related to LiDAR SLAM and multi-sensor fusion localization is given, covering point cloud registration algorithms, loop closure detection algorithms, pose graph algorithms, filter-based sensor fusion algorithms, and their interaction.
With the development of LiDAR SLAM, various registration algorithms have been proposed. The ICP algorithm is widely used for point cloud registration, and a variety of ICP variants have been derived to improve computational efficiency and accuracy [15]. However, ICP is very sensitive to the initial guess. Unlike ICP, the NDT registration algorithm builds a statistical probability model of the point cloud, which is more efficient and accurate. Study [16] proposed a 3D-NDT registration algorithm as an improvement of the 2D-NDT algorithm [17] and compared it qualitatively and quantitatively with the standard ICP algorithm; the results show that the method is faster and more reliable. Study [7] proposed an NDT-based SLAM method that achieves long-range, high-precision map building and localization in dynamic scenarios. Li et al. [18] improved the accuracy of stereo visual SLAM by removing dynamic obstacles. Wen et al. [19] compared the performance of NDT-based graph optimization SLAM under complex urban conditions; the results show that the performance of the NDT SLAM algorithm is positively related to the traffic environment.
Loop closure is essential for correcting drift error and building a globally consistent map [5]. Vision-based loop closure detection is often limited by illumination variance and environmental changes. Early LiDAR-based methods for place recognition focused on descriptors built from structural information [20]. However, such descriptor methods are limited by rotational invariance and poor point cloud resolution. Study [21] proposed a histogram method to address these problems but still produces false recognitions. To address the aforementioned issues, studies [22,23] proposed the scan context method, which introduces a more efficient bin encoding function for place recognition and is currently widely used in LiDAR SLAM; in addition, loop closure detection methods based on deep learning have also gradually been applied to SLAM [24].
Graph-based optimization [25], which optimizes the full trajectory and map of the vehicle from the full set of measurements, has received attention in many studies in recent years. General frameworks and open-source implementations of pose-graph methods are provided by [26,27]. Study [28] provides a tutorial for implementing graph-based SLAM. To improve the robustness of pose-graph SLAM, study [29] proposed a novel formulation that allows the back-end to change parts of the topological structure of the graph during optimization and validated it in experiments with loop closure constraints. To obtain accurate positions for mapping in large-scale environments, study [30] proposed global positioning system (GPS) and LiDAR odometry (GLO)-SLAM, which uses LiDAR to verify the reliability of GNSS, while the LiDAR odometry in turn is optimized using reliable GPS data. In addition, study [31] added IMU/odometry pre-integration results to a graph optimization framework, which effectively reduces navigation drift. With the development of deep learning, related techniques have also been applied to SLAM [32,33].
Multi-sensor fusion methods are commonly used in SLAM and localization. Fusing multiple sensors so that the whole system is accurate, robust, and applicable to various scenes is a challenging task. Study [34] integrated 2D LiDAR, IMU, and GPS into a localization system for urban and indoor scenarios, combining the IMU and RTK-GNSS for full-scene localization; vehicle velocity is good complementary information for localization, especially under satellite-denied and environmentally degraded conditions.

The Offline Mapping
The online LiDAR localization module relies on a pre-built map. The offline mapping aims to obtain a 3D point cloud map representation of the scenario. The NDT-based point cloud registration and the scan context-based loop closure detection are innovatively combined in the front-end, and the pose graph is used in the back-end to optimize the map.

NDT Based Registration
Compared with the ICP algorithm, NDT divides the point cloud space into cells, and each cell is modeled as a continuous Gaussian distribution. Considering the better computational efficiency and registration accuracy of NDT, it is chosen as the point cloud registration method. The NDT process can be expressed as follows.
In the point cloud, the point sets X and Y are consecutive frames: X is the frame at the previous moment and Y is the frame at the next moment. The transformation between X and Y can be expressed as

$$y_i' = R\,y_i + T$$

where T is the translation vector and R is the rotation matrix.

The mean of all points in X can be expressed as

$$\mu = \frac{1}{N_x}\sum_{i=1}^{N_x} x_i$$

where N_x is the number of points in X. The covariance of X can be expressed as

$$\Sigma = \frac{1}{N_x}\sum_{i=1}^{N_x}\left(x_i - \mu\right)\left(x_i - \mu\right)^T$$

Assuming that the transformation p = (R, T) transforms the point y_i to y_i', the transformed point lies in the same coordinate system as the target point set X, and its degree of coincidence can be expressed as a Gaussian distribution:

$$p(y_i') \propto \exp\!\left(-\tfrac{1}{2}\,(y_i'-\mu)^T\,\Sigma^{-1}\,(y_i'-\mu)\right)$$

The joint probability distribution of Y and X can be expressed as

$$\psi = \prod_{i=1}^{N_y} p(y_i')$$

where N_y is the number of points in Y. The objective function can be expressed as

$$(R^*, T^*) = \arg\max_{R,\,T}\ \psi$$

Therefore, maximizing the joint probability ψ means that the transformation achieves the highest degree of coincidence, and the optimization variables T and R represent the translation and rotation between the two consecutive frames, respectively.
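To make the registration objective concrete, the following minimal sketch (numpy only; function names such as build_ndt_grid and ndt_score are illustrative, not from the paper's implementation) builds per-cell Gaussian models from the target cloud X and scores a candidate transformation (R, T) by the summed Gaussian fitness. A complete NDT would additionally optimize (R, T), e.g., by Newton iterations on this score.

```python
import numpy as np
from collections import defaultdict

def build_ndt_grid(target, cell=2.0):
    """Group target points into voxels; fit a Gaussian (mu, Sigma) per voxel."""
    cells = defaultdict(list)
    for p in target:
        cells[tuple(np.floor(p / cell).astype(int))].append(p)
    grid = {}
    for k, pts in cells.items():
        pts = np.asarray(pts)
        if len(pts) < 5:          # too few points for a stable covariance
            continue
        mu = pts.mean(axis=0)
        sigma = np.cov(pts.T) + 1e-3 * np.eye(3)   # regularized covariance
        grid[k] = (mu, np.linalg.inv(sigma))
    return grid

def ndt_score(source, R, T, grid, cell=2.0):
    """Sum of Gaussian fitness terms; higher means better alignment."""
    score = 0.0
    for y in source:
        yp = R @ y + T                             # transformed point y_i'
        key = tuple(np.floor(yp / cell).astype(int))
        if key in grid:
            mu, sigma_inv = grid[key]
            d = yp - mu
            score += np.exp(-0.5 * d @ sigma_inv @ d)
    return score

# usage: score the identity transform between two nearly identical clouds
X = np.random.rand(1000, 3) * 20
Y = X + np.random.normal(0, 0.05, X.shape)         # slightly perturbed copy
print(ndt_score(Y, np.eye(3), np.zeros(3), build_ndt_grid(X)))
```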

Scan Context Based Loop Closure Detection
Compared with feature descriptors of the environment, few studies focus on structural information to describe scenes. Scan context is a non-histogram global descriptor that directly records the 3D structure of the visible space and can be deployed effectively in LiDAR-based place recognition. Its lightweight and efficient encoding, which improves the accuracy of loop closure detection, is conducive to storing point cloud information. The scan context method is applied in the offline mapping process. Firstly, scan context is used to detect the loop closure frame. After detecting the candidate frame among the historical frames, NDT is used to register the loop closure frame with the current point cloud frame to obtain the precise loop pose. Figure 2 shows the flow chart of scan context and loop closure detection. In the point cloud segmentation process, the point cloud space is cut into N_r rings along the increasing radius, and each ring is cut into N_s sectors:

$$d_r = \frac{L_{max}}{N_r}$$

where d_r represents the width of each ring, L_max represents the maximum measurement distance of the LiDAR, and N_r is the number of rings.

After segmentation, the segmented bin cells can be represented as a set P:

$$P = \{p_{ij}\}, \quad i \in [1, N_r],\ j \in [1, N_s]$$

where p_ij represents the set of points falling in the i-th ring and the j-th sector.

In the scan context generation process, the scan context I is represented as an N_r × N_s matrix, in which each element is the maximum value in the z-direction of all 3D points in the corresponding bin:

$$I(i,j) = \max_{p \in p_{ij}} z(p)$$

The distance function between the scan contexts of two point cloud frames is defined as

$$d(I^q, I^c) = \frac{1}{N_s}\sum_{j=1}^{N_s}\left(1 - \frac{c_j^q \cdot c_j^c}{\|c_j^q\|\,\|c_j^c\|}\right)$$

which is used as a similarity score, where I^q is the current frame scan context, c_j^q is the j-th column of I^q, I^c is a historical frame scan context, and c_j^c is the j-th column of I^c. When the current frame is rotated relative to the historical frame, the order of the column vectors in the current scan context changes, which inflates the distance between the two frames. Therefore, the historical frame I^c is shifted column by column, yielding N_s shifted scan contexts; the distance to the current frame scan context is computed for each shift, and the smallest one is selected:

$$D(I^q, I^c) = \min_{n \in [1, N_s]} d\!\left(I^q, I_n^c\right), \qquad n^* = \arg\min_{n \in [1, N_s]} d\!\left(I^q, I_n^c\right)$$

The loop frame can be found by comparing the similarity of scan contexts between the current frame and the historical frame point clouds; when the distance function is less than a certain threshold, the historical frame is considered to be a loop frame.
In the precise position calculation process, scan context is used to compute the rotation angle ϕ between the current frame and the loop frame to improve the efficiency and accuracy of the NDT registration, and ϕ is used as the initial pose for the NDT registration:

$$\phi = \frac{2\pi\, n^*}{N_s}$$

where n^* is the column shift that minimizes the scan context distance.
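A minimal numpy sketch of the descriptor and its rotation-compensated distance, following the equations above (heights are assumed non-negative, i.e., the sensor sits above the ground; names are illustrative):

```python
import numpy as np

def scan_context(points, n_r=20, n_s=60, l_max=80.0):
    """Encode a point cloud (N,3) as an n_r x n_s matrix of max heights."""
    sc = np.zeros((n_r, n_s))
    r = np.linalg.norm(points[:, :2], axis=1)            # planar range
    theta = np.arctan2(points[:, 1], points[:, 0])       # azimuth in (-pi, pi]
    ring = np.minimum((r / (l_max / n_r)).astype(int), n_r - 1)
    sector = np.minimum(((theta + np.pi) / (2 * np.pi / n_s)).astype(int), n_s - 1)
    mask = r < l_max                                     # discard out-of-range points
    for i, j, z in zip(ring[mask], sector[mask], points[mask, 2]):
        sc[i, j] = max(sc[i, j], z)                      # bin keeps max z
    return sc

def sc_distance(sc_q, sc_c):
    """Column-wise cosine distance, minimized over all column shifts."""
    n_s = sc_q.shape[1]
    best, best_shift = np.inf, 0
    for n in range(n_s):
        shifted = np.roll(sc_c, n, axis=1)               # shift historical frame
        num = np.sum(sc_q * shifted, axis=0)
        den = np.linalg.norm(sc_q, axis=0) * np.linalg.norm(shifted, axis=0)
        valid = den > 0                                  # skip empty columns
        d = np.mean(1.0 - num[valid] / den[valid]) if valid.any() else 1.0
        if d < best:
            best, best_shift = d, n
    return best, 2 * np.pi * best_shift / n_s            # distance and yaw phi
```

The returned yaw estimate corresponds to the angle ϕ fed to the NDT registration as an initial pose.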

RTK-GNSS Based Localization
Real-time kinematic (RTK) localization is a satellite navigation technique used to enhance the precision of position data derived from satellite-based navigation systems. RTK relies on a reference station that provides real-time corrections to the rover, yielding up to centimeter-level accuracy. However, there are many situations with severe multipath and signal blockage, such as among urban buildings or in forests. A stable and precise position and attitude can be provided for autonomous vehicles by combining RTK-GNSS with an IMU.

Back-End Optimization
After interframe association and submap matching, cumulative errors inevitably remain in the point cloud map. Pose-graph optimization is used to further eliminate these cumulative errors, with the loop closure positions and RTK-GNSS positions used as constraints. The back-end optimization step is summarized in Algorithm 1.

Algorithm 1: Back-end pose-graph optimization
1: for each new LiDAR odometry position x_i do
2:   add odometry node x_i to the graph
3:   if meet optimization cycle times h then
4:     execute optimization process:
5:     x_opt = arg min F(x_i, x_j, z_i, z_ij)
6:   else
7:     add RTK-GNSS position z_i constraint
8:     if loop closure position detected then
9:       add loop closure position z_ij constraint
10:    end if
11:  end if
12: end for
13: return optimized vehicle position x_opt
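As a concrete illustration of Algorithm 1's control flow, the hypothetical class below (names are illustrative, not the paper's code) accumulates odometry nodes and GNSS/loop constraints and triggers optimization every h keyframes; the solver argument is any least-squares routine over the graph, such as the one sketched in the Graph Optimization subsection below.

```python
import numpy as np

class PoseGraphBackend:
    """Minimal back-end loop mirroring Algorithm 1 (illustrative sketch).

    `solver` is any callable (nodes, gnss, loops) -> optimized positions."""
    def __init__(self, solver, h=100):
        self.solver = solver
        self.h = h                 # optimization cycle length (step 3)
        self.nodes = []            # odometry positions x_i
        self.gnss = {}             # node index i -> RTK-GNSS position z_i
        self.loops = []            # (i, j, z_ij) loop closure constraints
        self.x_opt = None

    def add_keyframe(self, x_i, z_i=None, loop=None):
        i = len(self.nodes)
        self.nodes.append(np.asarray(x_i))        # step 2: add odometry node
        if (i + 1) % self.h == 0:
            # steps 4-5: optimize over all accumulated constraints
            self.x_opt = self.solver(self.nodes, self.gnss, self.loops)
        else:
            if z_i is not None:
                self.gnss[i] = np.asarray(z_i)    # step 7: GNSS constraint
            if loop is not None:
                self.loops.append(loop)           # step 9: loop constraint
        return self.x_opt
```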

Graph Generation
The graph consists of nodes and edges, as shown in Figure 3: x_i represents a node from the LiDAR odometry, z_i represents the prior position from RTK-GNSS, and e_i represents the edge between x_i and z_i. z_ij represents the measured transformation between x_j and x_i obtained from loop closure, ẑ_ij represents the expected observation computed from the current estimates of x_i and x_j, and e_ij represents the edge between z_ij and ẑ_ij.


Graph Optimization
Graph optimization casts all the constraints into a non-linear optimization problem that considers all the observation measurements:

$$F(x) = \sum_{i} e_i^T\, \Omega_i\, e_i + \sum_{i,j} e_{ij}^T\, \Omega_{ij}\, e_{ij}$$

where F(x) represents the errors over all edges and Ω_ij is the information matrix indicating the importance of each constraint in the global graph optimization. To adjust the state x so as to minimize the residuals, the Jacobian of the residuals with respect to the state is required, after which a gradient-descent-type method is used for the optimization. The solution is the x_opt satisfying

$$x_{opt} = \arg\min_x F(x)$$

To integrate the RTK-GNSS into the graph optimization, the error between the LiDAR odometry x_i and the RTK-GNSS position z_i can be represented as

$$e_i = x_i - z_i$$

The residual e_i after adding a disturbance term Δx_i to x_i can be expressed as

$$e_i(x_i + \Delta x_i) = e_i(x_i) + \Delta x_i$$

The error between z_ij and the expected observation ẑ_ij can be represented as

$$e_{ij} = z_{ij} - \hat z_{ij}(x_i, x_j)$$

The residual e_ij after adding disturbances to x_i and x_j is expanded to first order, from which the Jacobian matrix J of the residual with respect to the state is obtained.

The first-order Taylor expansion of the residuals can be expressed as

$$e(x + \Delta x) \approx e(x) + J\,\Delta x$$

and the corrected state x_opt can be expressed as

$$x_{opt} = x + \Delta x, \qquad \Delta x = -\left(J^T \Omega J\right)^{-1} J^T \Omega\, e(x)$$
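The translation-only case makes the linearization above concrete: with e_i = x_i - z_i and e_ij = (x_j - x_i) - z_ij, the problem is linear and a single Gauss-Newton step solves it exactly. The sketch below is a numpy-only illustration under that simplifying assumption; a real implementation would operate on full SE(3) poses with a library such as g2o or Ceres.

```python
import numpy as np

def solve_pose_graph(nodes, gnss, loops, omega_gnss=1.0, omega_loop=10.0):
    """Translation-only pose graph: minimize
    sum_i w_g ||x_i - z_i||^2 + sum_(i,j) w_l ||(x_j - x_i) - z_ij||^2.
    nodes: list of 3D positions; gnss: {i: z_i}; loops: [(i, j, z_ij)]."""
    n, d = len(nodes), 3
    H = 1e-6 * np.eye(n * d)            # J^T.Omega.J, regularized for gauge freedom
    b = np.zeros(n * d)                 # J^T.Omega.e
    x = np.concatenate(nodes)           # initial guess from odometry

    for i, z in gnss.items():           # unary GNSS edges, Jacobian = I
        s = slice(i * d, i * d + d)
        e = x[s] - z
        H[s, s] += omega_gnss * np.eye(d)
        b[s] += omega_gnss * e

    for i, j, z_ij in loops:            # relative edges, Jacobians = -I, +I
        si, sj = slice(i * d, i * d + d), slice(j * d, j * d + d)
        e = (x[sj] - x[si]) - z_ij
        H[si, si] += omega_loop * np.eye(d); H[sj, sj] += omega_loop * np.eye(d)
        H[si, sj] -= omega_loop * np.eye(d); H[sj, si] -= omega_loop * np.eye(d)
        b[si] -= omega_loop * e; b[sj] += omega_loop * e

    dx = np.linalg.solve(H, -b)         # one Gauss-Newton step (exact: linear case)
    return (x + dx).reshape(n, d)
```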

The Online Localization
A multi-sensor fusion localization method based on the ESKF is proposed, which jointly estimates the vehicle position and attitude (PA) by fusing the vehicle states and the LiDAR localization pose.

LiDAR Localization Based on 3D Point Cloud Map
The LiDAR localization based on a 3D point cloud map estimates the position of the vehicle in real time, and this position is used for the Kalman filter observation update. In this process, the RTK-GNSS position is used as the initial position for the LiDAR localization to improve matching accuracy and efficiency. The NDT algorithm is used as the registration method to match the real-time point cloud with the local map. The LiDAR localization step is summarized in Algorithm 2.

Algorithm 2: LiDAR localization on the prior map
1: obtain the current point cloud frame p_i
2: if the initial position z_i has been received then
3:   extract the local submap M_sub around the current position
4:   if need update submap M_sub then
5:     update submap M_sub
6:   else
7:     calculate position between p_i and M_sub:
8:     NDT registration x_lidar = p_i ∝ M_sub
9:   end if
10: else
11:   wait for initial position z_i
12: end if
13: return LiDAR localization position x_lidar
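A minimal sketch of the map-matching step, using Open3D's point-to-point ICP as a stand-in for NDT (Open3D does not ship an NDT registration; names such as crop_submap are illustrative). The RTK-GNSS or previous pose seeds the registration, matching the role of the initial position in Algorithm 2.

```python
import numpy as np
import open3d as o3d

def to_cloud(arr):
    pc = o3d.geometry.PointCloud()
    pc.points = o3d.utility.Vector3dVector(arr)
    return pc

def crop_submap(map_pts, center, radius=60.0):
    """Keep map points within `radius` of the current position estimate."""
    keep = np.linalg.norm(map_pts[:, :2] - center[:2], axis=1) < radius
    return map_pts[keep]

def lidar_localize(scan_pts, map_pts, init_pose):
    """Register the live scan against the local submap; returns a 4x4 pose.
    ICP stands in here for the paper's NDT registration."""
    submap = to_cloud(crop_submap(map_pts, init_pose[:3, 3]))
    scan = to_cloud(scan_pts)
    result = o3d.pipelines.registration.registration_icp(
        scan, submap, 1.0, init_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation
```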

Filter State Equation
In the filter, the error state vector is expressed as

$$\delta x = \left[\delta P^T,\ \delta V^T,\ \varphi^T,\ \varepsilon^T,\ \nabla^T\right]^T$$

where δP is the position error, δV is the velocity error, φ is the attitude error, ε is the gyroscope bias error, and ∇ is the accelerometer bias error. The state transition equation in continuous time can be expressed as

$$\delta \dot{x} = F\,\delta x + B\,W$$

According to the derivation of the IMU kinematics model, the error dynamics are

$$\delta \dot{P} = \delta V, \qquad \delta \dot{V} = f^n \times \varphi + C_b^n \nabla + C_b^n w_a, \qquad \dot{\varphi} = -\omega_{ie}^n \times \varphi - C_b^n \varepsilon - C_b^n w_g, \qquad \dot{\varepsilon} = 0, \qquad \dot{\nabla} = 0$$

The east-north-up (ENU) frame and the right-forward-up (RFU) frame are chosen as the navigation reference frame n and the body frame b, respectively, where f^n = [f_E, f_N, f_U]^T is the specific force in the navigation frame, f_E, f_N, and f_U are the accelerations in the east, north, and up directions, and C_b^n is the direction cosine matrix from the b frame to the n frame. The earth rotation rate expressed in the navigation frame is

$$\omega_{ie}^n = \left[0,\ \omega \cos L,\ \omega \sin L\right]^T$$

where ω is the angular velocity of the earth's rotation and L is the latitude. W collects the gyroscope noise w_g and the accelerometer noise w_a:

$$W = \left[w_{gx}\ w_{gy}\ w_{gz}\ w_{ax}\ w_{ay}\ w_{az}\right]^T$$
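A minimal numpy sketch of the covariance prediction implied by the error dynamics above; the 15-state layout, first-order discretization, and noise handling are simplifying assumptions, not the paper's exact implementation.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix such that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

def eskf_predict(P, C_bn, f_b, w_ie_n, Q, dt):
    """One ESKF covariance prediction for the 15-state error vector
    [dP, dV, phi, eps, nabla], discretized to first order: Phi = I + F*dt."""
    F = np.zeros((15, 15))
    f_n = C_bn @ f_b                     # specific force in navigation frame
    F[0:3, 3:6] = np.eye(3)              # dP_dot = dV
    F[3:6, 6:9] = skew(f_n)              # dV_dot includes f^n x phi
    F[3:6, 12:15] = C_bn                 # accelerometer bias into dV_dot
    F[6:9, 6:9] = -skew(w_ie_n)          # phi_dot includes -w_ie^n x phi
    F[6:9, 9:12] = -C_bn                 # gyro bias into phi_dot
    Phi = np.eye(15) + F * dt            # first-order discretization
    return Phi @ P @ Phi.T + Q * dt      # propagated covariance
```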

Filter Measurement Update Equation
To compensate for the loss of localization signal under complex driving scenarios and enhance the robustness of the localization system, the LiDAR localization position and the vehicle velocity are used as observation inputs:

$$Y = \left[\delta P_L^T,\ \varphi_L^T,\ \delta V_v^T\right]^T$$

where δP_L is the LiDAR localization position error, φ_L is the attitude error, and δV_v is the vehicle velocity error. The observation equation is

$$Y = H\,\delta x + N$$

where H is the observation matrix and N is the observation noise, which can be expressed as

$$N = \left[n_p^T,\ n_\varphi^T,\ n_v^T\right]^T$$
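A sketch of the corresponding Kalman measurement update, assuming the 15-state layout from the prediction sketch and an H that directly selects the observed error states (an assumption consistent with the observation equation above, not the paper's exact matrices):

```python
import numpy as np

def eskf_update(P, dx, y, R):
    """Standard Kalman measurement update for the 15-state error filter.
    y stacks [dP_L, phi_L, dV_v] (9-vector); R is the 9x9 noise covariance."""
    H = np.zeros((9, 15))
    H[0:3, 0:3] = np.eye(3)      # LiDAR position error observes dP
    H[3:6, 6:9] = np.eye(3)      # attitude error observes phi
    H[6:9, 3:6] = np.eye(3)      # vehicle velocity error observes dV
    S = H @ P @ H.T + R                       # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    dx = dx + K @ (y - H @ dx)                # corrected error state
    P = (np.eye(15) - K @ H) @ P              # covariance update
    return dx, P
```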

Experimental Verification and Performance Analysis
This section introduces experiments with the KITTI dataset and field tests based on the proposed method.

The Experiment Based on KITTI Dataset
The KITTI dataset was jointly created by the Karlsruhe Institute of Technology and the Toyota Technological Institute at Chicago. It is currently the world's largest evaluation dataset for autonomous driving localization and computer vision algorithms. It contains LiDAR data, IMU data, RTK-GNSS data, velocity data, and the localization groundtruth, which is used to evaluate the mapping and localization accuracy. KITTI provides multiple sequences for various scenarios; sequence 00 was used in this study. The mapping and localization result is shown in Figure 4.

Figures 5 and 6 show the optimization performance. The abscissa represents the index of the data frame at which the position is saved; hereafter the abscissa is denoted Index, with the same meaning. In Figure 6, it can be seen that the optimized longitudinal, lateral, and altitude errors are reduced to centimeter level, which effectively eliminates the cumulative error of the front-end odometry and improves the mapping accuracy.

Figure 6. The optimized position error in the longitudinal, lateral, and altitude directions on the KITTI dataset; the error is significantly reduced after optimization.

Figure 7 shows the degree of coincidence between the groundtruth and the estimated trajectory. There is a large deviation between the unoptimized trajectory and the groundtruth, while the optimized trajectory error is significantly reduced. Table 1 shows that the trajectory accuracy after optimization is significantly improved and basically coincides with the groundtruth, with an average error of about 10 cm.


Localization Performance Analysis Based on KITTI Dataset
The LiDAR localization result is fused with the IMU and vehicle velocity to improve the localization accuracy in the case of scenario degradation. As shown in Figures 8 and 9, a localization performance test is conducted based on the prior KITTI point cloud map. As shown in Figure 10 and Table 2, the maximum position error on the KITTI dataset is within 35 cm, the average position error is within 20 cm, and a stable localization result is maintained.

The Field Test Vehicle and Test Results
To further verify the effectiveness of the proposed method, a four-wheel steering and four-wheel hub motor drive vehicle was developed by our team. It is equipped with sensors for data collection and can feed back vehicle state information through the controller area network (CAN) bus.

Test Vehicle and Sensor Configuration
A 32-beam LiDAR, an RTK-GNSS system, and an IMU are mounted on the test intelligent electric vehicle. The sensor specifications of the test vehicle are shown in Figure 11 and Table 3. The gyroscope and accelerometer bias stabilities of the IMU are 5 deg/h and 0.5 mg, respectively. In addition, the vehicle velocity can be obtained from the on-board CAN bus, and the groundtruth is provided by the RTK-GNSS system.

The proposed mapping method was tested in a real scenario to verify its performance. In the field test, the dataset was collected in an industrial park with scenario changes.
As shown in Figures 12 and 13, two point cloud maps were constructed by the offline mapping process, and the map drift was effectively eliminated after optimization. As shown in Figures 14 and 15, owing to the graph optimization algorithm, the optimized trajectory basically coincides with the groundtruth. It can be seen from Figures 16 and 17 that the position error on the three axes before optimization gradually increases beyond 1 m, and the average error before optimization is 1.6 m. As shown in Figures 18 and 19, the error on the three axes is reduced to centimeter level after optimization, and the average error is 8 cm. The above analysis shows that the proposed mapping method can reduce the position error significantly and construct a globally consistent map.

Field Test Localization Performance Analysis
Based on the prior point cloud map, five different field tests were implemented to verify the localization performance. The field tests included different driving conditions and scenarios.
Five sets of field tests represent different driving conditions and travel distances. As shown in Figures 20-25, owing to the small changes in the scenario, the fused trajectory basically coincides with the groundtruth under the normal driving scenario and the one-lap curve driving scenario, and the maximum error does not exceed 45 cm. As shown in Figures 26-31, another two sets of experiments were conducted under different scenarios; due to changes in the environment, the LiDAR localization position drifted, but the maximum positioning error after fusing the IMU and the vehicle velocity is still controlled within 55 cm. As shown in Figures 32-34, the localization performance remains robust in the large-scale scenario. It can be seen from Table 4 that the average position error is within 30 cm, which meets the lane-level localization requirements of autonomous driving. The field test results show that the localization algorithm based on the prior point cloud map achieves good performance.


Discussion
From the experimental results, the findings can be summarized as follows:
• The offline mapping method proposed in this paper can effectively eliminate map drift and provides a mapping accuracy of 5-10 cm; it can serve as a stable and reliable map data source for the localization work.
• In the online localization process, the multi-sensor fusion method achieves a positioning accuracy of 20-30 cm, which benefits from the good mapping accuracy and the design of the multi-sensor fusion model.

Conclusions and Future Work
This paper presented a LiDAR-based sensor fusion SLAM and localization method for autonomous vehicle offline mapping and online localization. In the mapping process, NDT registration, scan context loop closure detection, and the RTK-GNSS position are combined in the front-end and back-end, with an innovative use of the pose graph to fuse multiple constraints, optimize the position, and reduce map drift; this achieves 5-10 cm mapping accuracy and effectively eliminates map drift. In the online localization process, the ESKF fuses complementary sensor information, such as LiDAR, IMU, and vehicle velocity, taking full advantage of the vehicle velocity constraints of ground autonomous vehicles to optimize the localization results; it reaches 20-30 cm localization accuracy in various challenging scenarios and shows environmental robustness. Such stable mapping and localization results can help autonomous vehicles safely complete lane-level navigation tasks. Furthermore, the proposed method can be extended to fuse more sensors for offline mapping and online localization in different applications. In future work, we will tightly couple the IMU and LiDAR in the mapping system to reduce front-end drift and improve the quality of the mapping.